exponential decay property
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (3 more...)
- Research Report (0.46)
- Overview (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Communications > Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Scalable spectral representations for multi-agent reinforcement learning in network MDPs
Ren, Zhaolin, Zhang, Runyu, Dai, Bo, Li, Na
Multi-agent network systems have found applications in various societal infrastructures, such as power systems, traffic networks, and smart cities [McArthur et al., 2007, Burmeister et al., 1997, Roscia et al., 2013]. One particularly important class of such problems is the cooperative multi-agent network MDP setting, where agents are embedded in a graph, and each agent has its own local state [Qu et al., 2020b]. In network MDPs, the local state transition probabilities and rewards only depend on the states and actions of the agent's direct neighbors in the graph. Such a property has been observed in a great variety of cooperative network control problems, ranging from thermal control of multizone buildings [Zhang et al., 2016], wireless access control [Zocca, 2019] to phase synchronization in electrical grids [Blaabjerg et al., 2006], where agents typically only need to act and learn based on information within a local neighborhood due to constraints on the information and communication infrastructure.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range
This paper proposes a scalable distributed policy gradient method and proves its convergence to near-optimal solution in multi-agent linear quadratic networked systems. The agents engage within a specified network under local communication constraints, implying that each agent can only exchange information with a limited number of neighboring agents. On the underlying graph of the network, each agent implements its control input depending on its nearby neighbors' states in the linear quadratic control setting. We show that it is possible to approximate the exact gradient only using local information. Compared with the centralized optimal controller, the performance gap decreases to zero exponentially as the communication and control ranges increase. We also demonstrate how increasing the communication range enhances system stability in the gradient descent process, thereby elucidating a critical trade-off. The simulation results verify our theoretical findings.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
Qu, Guannan, Lin, Yiheng, Wierman, Adam, Li, Na
It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues due to the fact that the size of the state and action spaces are exponentially large in the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically, we propose a Scalable Actor-Critic (SAC) method that can learn a near optimal localized policy for optimizing the average reward with complexity scaling with the state-action space size of local neighborhoods, as opposed to the entire network. Our result centers around identifying and exploiting an exponential decay property that ensures the effect of agents on each other decays exponentially fast in their graph distance.
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)
Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems
Qu, Guannan, Wierman, Adam, Li, Na
We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental challenge in this setting is that the state-action space size scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a Scalable Actor-Critic (SAC) framework that exploits the network structure and finds a localized policy that is a $O(\rho^\kappa)$-approximation of a stationary point of the objective for some $\rho\in(0,1)$, with complexity that scales with the local state-action space size of the largest $\kappa$-hop neighborhood of the network.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)